Abstract


We consider the sensitivity of a model for a dependent variable on several exogenous
variables when the dependent variable has missing data. Under certain assumptions
on the model for the selected sample and on the selection mechanism, a mixture model is derived
and some of its statistical properties are discussed. This model gives a way to derive the
Pearson-Lawley (PL) correction formula for the covariance matrix and leads to a modification when
the missingness is not ignorable. A sensitivity analysis is then discussed for the PL method.
Finally, the modified PL method is applied to a real data set from Project A of the Office of Naval
Research. The results differ somewhat from those obtained with the Pearson-Lawley method or
with listwise deletion.

KEY WORDS: sensitivity analysis, nonignorable missingness, Pearson-Lawley (PL) formula,
modified PL method.

RUNNING TITLE: Sensitivity Analysis for PL Corrections

1 Introduction

The selection of individuals is common in educational institutions, corporations, and military
organizations. In this situation, an important issue is to establish a model for a dependent
variable, such as job performance, on exogenous variables, such as test scores and other background
variables, for the whole population, so that prediction and its validity can be studied. Since job
performance can be measured only for those who are selected, the problem we study in this paper
is how to deal with the selection of candidates, or equivalently with the missing performance
measurements of the unselected ones.

In terms of missingness, there are two basic types of missing-data mechanism (Little & Rubin,
1987), depending on the relationship between the missingness and the dependent variable of the
unselected candidates. One is called missing at random (MAR), or ignorable nonresponse. In this
case, whether a candidate is selected is independent of his performance measurements but may
depend on the exogenous variables. This missingness is ignorable in the sense that estimates can
be obtained from the likelihood function of the observed sample, ignoring the missing
observations. The other mechanism is called nonignorable missingness, in which case whether an
individual is selected may depend on the performance measurements.

For the first mechanism, MAR, many statistical techniques have been proposed (see, e.g.,
Little & Rubin, 1987; Rubin, 1987). The simplest is to use only the observed sample for
statistical inference, which is known as listwise deletion in the missing data literature. In
addition to listwise deletion, there are many regression-based adjustment methods, such as the
Pearson-Lawley correction, the maximum likelihood procedure, and multiple imputation techniques
(Rubin, 1987). All of these methods, except listwise deletion, often give good statistical inference
for MAR cases.


When the data are not missing at random, the situation is more complicated. It is known that
methods assuming MAR may be quite biased when the missingness is nonignorable. Heckman
(1976) proposed a selection model which assumed a missing data mechanism in terms of a
conditional probability of missing or not missing given the observed measurements. A least-squares
correction was proposed by Olsen (1980). Lee (1982) gave some approaches using a transformation
based on a bias function. Muthen and Joreskog (1983) pointed out that nonlinearity and
heteroscedasticity might occur in nonrandomly selected samples even though the population
itself was normally distributed. Recently, Little (1994) proposed two unified models for the data
and the missing mechanism: a random-coefficient selection model and a random-coefficient
pattern-mixture model.

It is often necessary to make some assumptions about the missing mechanism in order to draw
statistical inference from data that involve nonignorable missingness. In this case, a sensitivity
analysis is of interest, since it shows how much the outcomes are affected by the assumptions about
the missing mechanism. Allen, Holland, and Thayer (1994) discussed such a sensitivity analysis for a
mixture model (Rubin, 1987) and simplified selection modeling. Brown and Zhu (1994) explored
several families of nonignorable missing mechanisms and proposed a compromise solution which
provides some protection against nonrandomness.

In this paper, we study the selection problem with one performance variable and several
exogenous variables, or covariates. A typical example comes from military enlistment and job
assignment, where hands-on job performance is of primary concern, while selection is based on test
scores and other background variables. In this situation, the Pearson-Lawley correction for the
covariance matrix is commonly used for validity assessment.

We first discuss model specification and some statistical properties in Section 2. In Section 3, a
modification of the Pearson-Lawley method is derived for the case with nonignorable missingness.
This method gives the same formula as the Pearson-Lawley method under MAR. In Section 4, a
sensitivity analysis is presented to show how much the outcomes are affected by the assumptions,
and some discussion is given on how to choose the unknown parameters according to the available
information. Finally, the method is applied to a real data set from Project A of the Office of Naval
Research (ONR). Compared with the PL method or listwise deletion, the modified Pearson-Lawley
method leads to somewhat different conclusions.


2  Model  specification 

Suppose y is the performance measurement of interest and x is a vector of exogenous variables
related to selection. Furthermore, we assume that x is always observed, with no missing data,
but that y is observed only if the individual is selected. Usually there is no information available
for y in the unselected sample. Because of this, we make a model assumption only for the selected
sample.

Let R be an indicator of selection such that R = 1 if a candidate is selected and y is observed,
and R = 0 if the candidate is unselected and y is missing.


2.1  A  mixture  model 

Without loss of generality, we assume that the means of y and x are all 0; otherwise they can be
centered by transformations. Then we assume that when R = 1,

$$ [y \mid R = 1] = x\beta + \epsilon, \qquad \epsilon \sim N(0, \sigma^2) \tag{1} $$

for some parameters $\beta$ and $\sigma^2$. This assumes a normal distribution for the conditional
distribution of y given x when y is observed.

One advantage of making assumptions only on the selected sample is that they can be
checked, since we have observations of both y and x in the selected sample. Note that the model of
(1)  may  be  quite  different  from  assuming  a  linear  model  and  normality  over  the  whole  population 
as  in  Heckman  (1976),  Olsen  (1980),  Lee  (1982),  or  Muthen  and  Joreskog  (1983).  As  Muthen  and 
Joreskog  pointed  out,  when  the  whole  population  follows  a  linear  model  with  normal  residuals,  a 
nonrandom  selection  procedure  results  in  the  model  of  the  selected  sample  being  neither  linear  nor 
normal. 

Now suppose there is a selection variable s (a latent variable) such that, for some function $g(\cdot)$,

$$ s = g(x) + \delta, \qquad R = \begin{cases} 1 & \text{if } s > 0 \\ 0 & \text{otherwise} \end{cases} \tag{2} $$

where $g(x)$ contains all of the contribution to the selection from the exogenous variables x and
$\delta$ is a residual term, which may be viewed as the contribution from something other than x.
This $\delta$ is not observed and may depend on both y and x.

Let [y] be the distribution of y. This notation may denote a cumulative distribution function, a
probability mass function when y is discrete, or a density function when y is continuous. Under
the above assumptions, we have a mixture model

$$ [y \mid x] = [R = 1 \mid x]\,[y \mid x, R = 1] + [R = 0 \mid x]\,[y \mid x, R = 0] \tag{3} $$

where $[y \mid x, R = 0]$ is the distribution of y given x for the unselected sample. This model was
proposed by Glynn, Laird and Rubin (1986) and is called a mixture model. Recently, Allen,
Holland and Thayer (1994) applied a similar model to nonignorable nonresponse problems for a
discrete variable y.

With notation as before, let

$$ p = P(R = 1 \mid x) $$

be the selection rate for given exogenous variables x. Then

$$ [y \mid x] = p\,[y \mid x, R = 1] + (1 - p)\,[y \mid x, R = 0]. $$


With  some  probability  calculation,  we  have  the  following  result. 


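Result 1. Let $[y \mid x]$, $[y \mid x, R = 0]$, $[y \mid x, R = 1]$ be the corresponding probability
mass or density functions. Then

$$ [y \mid x, R = 0] = \frac{[R = 1 \mid x]}{[R = 0 \mid x]} \, \frac{[R = 0 \mid y, x]}{[R = 1 \mid y, x]} \, [y \mid x, R = 1] \tag{4} $$

$$ [y \mid x] = \frac{[y \mid x, R = 1]\,[R = 1 \mid x]}{[R = 1 \mid y, x]} = \frac{[y \mid x, R = 1]\, p}{[R = 1 \mid y, x]} \tag{5} $$
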

When y is a discrete response variable, this result and a proof were given in Allen et al.
(1994). A similar argument can be used when y is a continuous variable and leads to
Result 1.
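As an illustration of Result 1, the following minimal sketch checks (4) and (5) numerically for a discrete y at one fixed x. The three-point distribution and the selection probabilities are made-up numbers chosen only for the check; they do not come from the paper.

```python
import numpy as np

# Hypothetical example: y takes three values for one fixed x.
f_y = np.array([0.2, 0.5, 0.3])       # [y | x], population distribution (assumed)
sel = np.array([0.1, 0.4, 0.8])       # [R = 1 | y, x], selection mechanism (assumed)

p = float(np.sum(f_y * sel))          # [R = 1 | x], the selection rate
f_sel = f_y * sel / p                 # [y | x, R = 1] by Bayes' rule
f_unsel = f_y * (1 - sel) / (1 - p)   # [y | x, R = 0] by Bayes' rule

# Equation (4): unselected distribution recovered from the selected one.
rhs4 = (p / (1 - p)) * ((1 - sel) / sel) * f_sel
# Equation (5): population distribution recovered from the selected one.
rhs5 = f_sel * p / sel

print(np.allclose(rhs4, f_unsel), np.allclose(rhs5, f_y))
```

Both identities hold for any selection probabilities strictly between 0 and 1, which is why only the selected-sample distribution and the selection mechanism need to be specified.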


When the data are missing at random, we have $[R = 0 \mid y, x] = [R = 0 \mid x]$ and
$[R = 1 \mid y, x] = [R = 1 \mid x]$. Hence from (4), the distribution of y given x for the unselected
sample, and therefore for the whole population, is the same as that of the selected sample. In many
situations, however, the missingness may not be ignorable, that is, $[R = 0 \mid y, x]$ may depend
on y. In this case, we need to know the distribution of y given x and R = 0.

The result (4) gives a relationship between the distribution of y given x among those unselected
and the distribution of y given x among those selected. The term $[R = 1 \mid y, x]$ specifies a
selection mechanism, or equivalently $[R = 0 \mid y, x]$ is a missing data mechanism, for which we
need to make assumptions. The result (5) expresses the distribution of y given x for the whole
population in terms of that of the selected sample and the selection mechanism.

After a model is assumed for the selection mechanism, Result 1 leads to a model for
the full population. A sensitivity analysis will show how much the conclusions for the model of y|x
may be affected by varying these assumptions. This is discussed in later sections.


2.2  A  logistic  selection  mechanism 

Assuming  (2),  we  have 

$$ [R = 1 \mid y, x] = P[s > 0 \mid y, x] = P[\delta > -g(x) \mid y, x] $$

Hence, the selection mechanism requires the conditional distribution of the residual $\delta$ of the
selection variable s given y and x.

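To be explicitly workable, we take a quadratic logistic model for the conditional distribution
of $\delta$ given y and x,

$$ [\delta > 0 \mid y, x] = \frac{1}{1 + \exp\!\big(-\kappa(\lambda_0(x) + \lambda_1(x) y + \lambda_2(x) y^2)\big)} \tag{6} $$

where $\kappa$ is a scale constant (a logistic distribution with scale $1/\kappa$ has unit variance
when $\kappa = \pi/\sqrt{3}$) and $\lambda_0(x)$, $\lambda_1(x)$ and $\lambda_2(x)$ are coefficients
which may depend on x. Under this assumption, we have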


Result 2. If $[\delta \mid y, x]$ is given as in (6), then

$$ [R = 1 \mid y, x] = \frac{1}{1 + \exp\!\big(-\kappa(g(x) + \lambda_0(x) + \lambda_1(x) y + \lambda_2(x) y^2)\big)} \tag{7} $$

and

$$ E(\delta \mid y, x) = \lambda_0(x) + \lambda_1(x) y + \lambda_2(x) y^2, \tag{8} $$

$$ V(\delta \mid y, x) = 1. \tag{9} $$
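The selection probability in (7) can be sketched as a small function. This is only an illustration: the value `KAPPA = pi / sqrt(3)` is an assumption, chosen because it is the scale at which a logistic residual has the unit variance stated in (9), and the argument names are hypothetical.

```python
import math

# Assumption: kappa = pi / sqrt(3), so that the logistic residual has variance 1.
KAPPA = math.pi / math.sqrt(3.0)

def selection_prob(y, g_x, lam0, lam1, lam2, kappa=KAPPA):
    """P(R = 1 | y, x) under the quadratic logistic mechanism of equation (7).

    g_x is the value of g(x); lam0, lam1, lam2 are the values of the
    coefficients lambda_0(x), lambda_1(x), lambda_2(x) at the same x.
    """
    eta = g_x + lam0 + lam1 * y + lam2 * y * y
    return 1.0 / (1.0 + math.exp(-kappa * eta))

# Variance of a logistic distribution with scale 1/kappa is pi^2 / (3 kappa^2).
residual_variance = math.pi ** 2 / (3.0 * KAPPA ** 2)
```

With `lam1 = lam2 = 0` the probability does not depend on y, which is the ignorable case; a positive `lam1` makes selection more likely for larger y.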


2.3  Statistical  Properties  of  the  mixture  model 

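First, let us look at the distribution of y|x for the whole population. From (5) and (7), we have

$$ \begin{aligned}
[y \mid x] &= p\,[y \mid x, R = 1] / [R = 1 \mid y, x] \\
&= p\, f(y \mid x, R = 1) + p \exp\!\big(-\kappa(g(x) + \lambda_0(x) + \lambda_1(x) y + \lambda_2(x) y^2)\big) f(y \mid x, R = 1) \\
&= p\, f(y \mid x, R = 1) + (1 - p)\, f(y \mid x, R = 0)
\end{aligned} $$

where, up to a constant factor that is fixed by the normalization,

$$ f(y \mid x, R = 0) = \exp\!\big[-\kappa(g(x) + \lambda_0(x) + \lambda_1(x) y + \lambda_2(x) y^2)\big] f(y \mid x, R = 1) \tag{10} $$

must be a density function. This requires certain constraints on the parameters $\lambda_0(x)$,
$\lambda_1(x)$ and $\lambda_2(x)$. Without loss of generality, we may vary $\lambda_1(x)$ and
$\lambda_2(x)$ but treat $\lambda_0(x)$ as a normalization parameter. Then from (1) and (10), we see
that $[y \mid x, R = 0]$ follows a normal distribution with

$$ \text{Mean:} \quad \mu = \big(x\beta/\sigma^2 - \kappa\lambda_1(x)\big)\tau^2 \tag{11} $$

$$ \text{Variance:} \quad \tau^2 = \big(2\kappa\lambda_2(x) + 1/\sigma^2\big)^{-1} \tag{12} $$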

where $2\kappa\lambda_2(x) + 1/\sigma^2 > 0$ is a constraint on $\lambda_2(x)$. Therefore, we have the
following result.

Result 3. Under the assumptions of (1) and (7), $[y \mid x]$ has a mixture distribution resulting from
two normal densities,

$$ [y \mid x] = p\,N(x\beta, \sigma^2) + (1 - p)\,N(\mu, \tau^2) \tag{13} $$

where $\lambda_2(x)$ is chosen such that $2\kappa\lambda_2(x) + 1/\sigma^2 > 0$. Moreover, the mean and
variance of $[y \mid x]$ are given as follows.

$$ E(y \mid x) = p\,x\beta + (1 - p)\,\mu \tag{14} $$

$$ V(y \mid x) = p\,\sigma^2 + (1 - p)\,\tau^2 + p(1 - p)(x\beta - \mu)^2 \tag{15} $$
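The moments in (11), (12), (14) and (15) are simple enough to compute directly. The sketch below is an illustration under the stated model; the function and argument names are hypothetical, and `xb` stands for the scalar value of xβ at a given x.

```python
def unselected_moments(xb, sigma2, lam1, lam2, kappa):
    """Mean and variance of [y | x, R = 0] from equations (11) and (12)."""
    precision = 2.0 * kappa * lam2 + 1.0 / sigma2
    if precision <= 0.0:
        # Constraint stated with (12): 2*kappa*lambda_2 + 1/sigma^2 > 0.
        raise ValueError("lambda_2 violates the positivity constraint")
    tau2 = 1.0 / precision
    mu = (xb / sigma2 - kappa * lam1) * tau2
    return mu, tau2

def mixture_moments(p, xb, sigma2, mu, tau2):
    """Mean and variance of the two-component mixture, equations (14) and (15)."""
    mean = p * xb + (1.0 - p) * mu
    var = p * sigma2 + (1.0 - p) * tau2 + p * (1.0 - p) * (xb - mu) ** 2
    return mean, var

# Ignorable case: lambda_1 = lambda_2 = 0 gives mu = x*beta and tau^2 = sigma^2.
mu0, tau2_0 = unselected_moments(xb=2.0, sigma2=4.0, lam1=0.0, lam2=0.0, kappa=1.8)
```

In the ignorable case the mixture collapses to the selected-sample normal distribution, so its mean and variance are again xβ and σ².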

It is of interest to look at the correlation between y and the selection variable s given x (see
equations (1) and (2)). This correlation is an important indicator of whether the missingness is
ignorable. In fact, if the missingness is ignorable, then s|x is independent of y; conversely,
if corr(y, s|x) = 0, then the selection might be unrelated to the dependent variable y and we could
expect the data to be missing at random. With some calculation, we have the following result.

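Result 4. Under the assumptions of (1) and (6), the correlation of y and s given x is

$$ \rho = \mathrm{corr}(y, s \mid x) = \mathrm{cov}(y, \delta \mid x) \big/ \sqrt{V(y \mid x)\, V(\delta \mid x)} \tag{16} $$

where $V(y \mid x)$ is given in (15) and

$$ \mathrm{cov}(y, \delta \mid x) = \lambda_1(x) V(y \mid x) + \lambda_2(x)\, \mathrm{cov}(y, y^2 \mid x) $$

$$ V(\delta \mid x) = 1 + \lambda_1(x)^2 V(y \mid x) + \lambda_2(x)^2 V(y^2 \mid x) + 2\lambda_1(x)\lambda_2(x)\, \mathrm{cov}(y, y^2 \mid x) $$

and

$$ \begin{aligned}
\mathrm{cov}(y, y^2 \mid x) &= (3p - p^2)\sigma^2 x\beta + (2 - p - p^2)\mu\tau^2 \\
&\quad + p(1 - p)\big[(x\beta + \mu)(x\beta - \mu)^2 - \sigma^2\mu - \tau^2 x\beta\big] \\
V(y^2 \mid x) &= \big(3\sigma^4 + 6\sigma^2(x\beta)^2 + (x\beta)^4\big)p + \big(3\tau^4 + 6\tau^2\mu^2 + \mu^4\big)(1 - p) \\
&\quad - \big[p(\sigma^2 + (x\beta)^2) + (1 - p)(\tau^2 + \mu^2)\big]^2
\end{aligned} $$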

Remark 1. The correlation $\rho$ between y and s given x may depend on x unless $\lambda_1(x)$ and
$\lambda_2(x)$ are constants.


3  A  modification  of  the  Pearson-Lawley  formula 

3.1 Pearson-Lawley correction

It  is  well  known  that  many  statistical  analyses,  such  as  linear  regression,  factor  analysis,  and 
structural  equation  modeling  can  be  done  using  only  the  mean  vector  and  the  covariance  matrix 
without  having  raw  data.  In  fact,  the  first  two  moments  give  sufficient  statistics  under  the  normality 
assumption.  How  to  get  a  good  estimate  of  the  mean  vector  and  the  covariance  matrix  has  received 
a  great  deal  of  attention  in  statistical  literature. 

When selection or missing data is involved, estimating the mean vector and the covariance
matrix is not straightforward. Pearson (1903) and Lawley (1943, 1944) gave adjustment formulas
for the mean and covariance matrix of the population given the selected sample. Suppose
y is the dependent variable, which has missing data, and x are the covariates, which are completely
observed. Let z = (x, y). Then, under the assumption of MAR, that is, when the selection is
ignorable, the PL correction formulas are maximum likelihood estimates without constraints on the
mean and covariance matrix of y and x. Let $\mu_z^*$ and $\Sigma_{zz}^*$ be the mean vector and the
covariance matrix based on the observed sample, and let $\mu_x$ and $\Sigma_{xx}$ be the mean vector
and covariance matrix of x from the total sample. Then the Pearson-Lawley correction is given as
follows.


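$$ \mu_z = \mu_z^* - \Sigma_{zx}^* \Sigma_{xx}^{*-1} (\mu_x^* - \mu_x) \tag{17} $$

$$ \Sigma_{zz} = \Sigma_{zz}^* - \Sigma_{zx}^* \Sigma_{xx}^{*-1} (\Sigma_{xx}^* - \Sigma_{xx}) \Sigma_{xx}^{*-1} \Sigma_{xz}^* \tag{18} $$

If we decompose the matrix according to the sizes of x and y and denote

$$ \Sigma_{zz} = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, $$

then it is not difficult to find that

$$ \begin{aligned}
\Sigma_{11} &= \Sigma_{xx} \\
\Sigma_{21} &= \Sigma_{yx}^* \Sigma_{xx}^{*-1} \Sigma_{xx} \\
\Sigma_{22} &= \Sigma_{yy}^* - \Sigma_{yx}^* \big(\Sigma_{xx}^{*-1} - \Sigma_{xx}^{*-1} \Sigma_{xx} \Sigma_{xx}^{*-1}\big) \Sigma_{xy}^*
\end{aligned} $$
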

Hence with the Pearson-Lawley adjustment, the covariance matrix for the exogenous variables is just
the covariance matrix obtained from the total sample. Adjustment is made only to the covariances
between x and y and to the variance of y. In contrast to the analysis based on the observed
sample only (listwise deletion), this correction may give a substantial improvement in the
statistical inference.
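The correction in (17) and (18) can be written compactly with NumPy. The following is a minimal sketch rather than a reference implementation; the function name and argument names are hypothetical, and z is assumed to be ordered with the x variables first and y last.

```python
import numpy as np

def pearson_lawley(mu_z_sel, S_zz_sel, mu_x_pop, S_xx_pop):
    """Pearson-Lawley corrected mean and covariance of z = (x, y).

    mu_z_sel, S_zz_sel: mean vector and covariance matrix of z in the selected sample.
    mu_x_pop, S_xx_pop: mean vector and covariance matrix of x in the total sample.
    """
    mu_z_sel = np.asarray(mu_z_sel, dtype=float)
    S_zz_sel = np.asarray(S_zz_sel, dtype=float)
    mu_x_pop = np.asarray(mu_x_pop, dtype=float)
    S_xx_pop = np.asarray(S_xx_pop, dtype=float)

    k = mu_x_pop.shape[0]                 # number of exogenous variables
    S_zx = S_zz_sel[:, :k]                # Sigma*_zx
    S_xx = S_zz_sel[:k, :k]               # Sigma*_xx
    # B = Sigma*_zx Sigma*_xx^{-1}, computed with a linear solve (Sigma*_xx is symmetric).
    B = np.linalg.solve(S_xx, S_zx.T).T

    mu_z = mu_z_sel - B @ (mu_z_sel[:k] - mu_x_pop)      # equation (17)
    S_zz = S_zz_sel - B @ (S_xx - S_xx_pop) @ B.T        # equation (18)
    return mu_z, S_zz

# Made-up one-covariate example: selection has shrunk the variance of x from 4 to 1.
mu_z, S_zz = pearson_lawley([0.2, 1.0], [[1.0, 0.5], [0.5, 2.0]], [0.0], [[4.0]])
```

In this example the corrected x block equals the total-sample variance 4, the covariance becomes 0.5 × 4 = 2, and the variance of y becomes 2 + 0.5 × 3 × 0.5 = 2.75, in agreement with the block formulas above.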

The adjustment formula (18) can be derived from our model assumption when missingness is
ignorable. In fact, when missingness is ignorable, the selection is based on x and is independent
of the performance score y. In other words, the $\delta$ in (2) is independent of y. Then, given the
exogenous variable x, the selected sample and the unselected sample follow the same distribution.
Under the model (1), we have

$$ [y \mid x] = [y \mid x, R = 1] \sim N(x\beta, \sigma^2) $$

Hence, given the observations, we can model y for the total population as follows,

$$ y = x\beta^* + \epsilon \tag{19} $$

where $\beta^* = \Sigma_{xx}^{*-1}\Sigma_{xy}^*$ is estimated from the observed sample, and $\epsilon$ is
a random variable which is independent of x and has mean 0 and variance

$$ \sigma^{*2} = \Sigma_{yy}^* - \Sigma_{yx}^* \Sigma_{xx}^{*-1} \Sigma_{xy}^*. $$

Then the covariances between y and x and the variance of y can be calculated from (19).

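Result 5. When the selection is ignorable and $[y \mid x, R = 1]$ follows the distribution (1), we have

$$ \mathrm{cov}(y, x) = \Sigma_{yx}^* \Sigma_{xx}^{*-1} \Sigma_{xx}, $$

$$ \mathrm{cov}(y, y) = \Sigma_{yy}^* - \Sigma_{yx}^* \big(\Sigma_{xx}^{*-1} - \Sigma_{xx}^{*-1} \Sigma_{xx} \Sigma_{xx}^{*-1}\big) \Sigma_{xy}^*, $$

which are just the same as the Pearson-Lawley adjustment formulas.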

Note that all of the above is based on the assumption that the selection is ignorable, that is,
that the data are missing at random. This is a crucial condition for deriving the Pearson-Lawley
formula. Without this assumption, that is, when the missingness is not ignorable, the
Pearson-Lawley adjustment may be seriously biased. A modified scheme is proposed as follows.

3.2 A modification of the PL formula

Assuming the mixture model (13), we first discuss the choice of the parameters $\lambda_1(x)$ and
$\lambda_2(x)$. Since both parameters are unknown, we have to specify them subjectively; there may
be no information about them in the observations. Hence, from the point of view of
simplicity and plausibility, it is reasonable to first assume that $\lambda_2(x)$ is independent of x. Let

$$ v = (2\kappa\lambda_2\sigma^2 + 1)^{-1} \tag{20} $$

Then by (12), $\tau^2 = v\sigma^2$ and

$$ \mu = v\big(x\beta - \kappa\lambda_1(x)\sigma^2\big) = v\big[1 - \kappa\lambda_1(x)\sigma^2/(x\beta)\big]\,x\beta. $$

Now we choose the coefficient $\lambda_1(x)$ to be proportional to $x\beta$ and let

$$ m = v\big(1 - \kappa\lambda_1(x)\sigma^2/(x\beta)\big) \tag{21} $$

Then m is also a constant. Under these assumptions, the mean and the variance of the
unselected sample are just scale transformations of the mean and the variance of the selected
sample, that is, $\mu = m\,x\beta$ and $\tau^2 = v\sigma^2$.

In the following discussion, we use the notation m and v. Under these, we have
$\lambda_1(x) = (v - m)\,x\beta/(v\kappa\sigma^2)$ and $\lambda_2 = (1 - v)/(2v\kappa\sigma^2)$. Moreover,
the formulas in Result 3 and Result 4 can be expressed in terms of m and v accordingly.

Now we can give a modification of the Pearson-Lawley formula.

Result 6. Under the assumptions of (13), for given v > 0 and m,

$$ \mathrm{Cov}(y, x) = \big(p + (1 - p)m\big)\beta'\Sigma_{xx} = f_1(m, p, \sigma^2)\,\beta'\Sigma_{xx} \tag{22} $$

$$ \begin{aligned}
\mathrm{Cov}(y, y) &= \big(p + (1 - p)v\big)\sigma^2 + \big[(p + (1 - p)m)^2 + p(1 - p)(1 - m)^2\big]\beta'\Sigma_{xx}\beta \\
&= f_2(v, p, \sigma^2)\,\sigma^2 + f_3(m, p, \sigma^2)\,\beta'\Sigma_{xx}\beta
\end{aligned} \tag{23} $$

where $f_1(m, p, \sigma^2)$, $f_2(v, p, \sigma^2)$ and $f_3(m, p, \sigma^2)$ are functions of the selection
rate p, the residual variance $\sigma^2$ and the parameters m and v. For simplicity, we denote them by
$f_1$, $f_2$ and $f_3$ in the following discussion.

Finally, a modified Pearson-Lawley formula can be obtained by replacing $\beta$ and $\sigma^2$ in (22)
and (23) by the estimates $\beta^* = \Sigma_{xx}^{*-1}\Sigma_{xy}^*$ and
$\sigma^{*2} = \Sigma_{yy}^* - \Sigma_{yx}^*\Sigma_{xx}^{*-1}\Sigma_{xy}^*$. After considering the centering,

the mean of y can be estimated as

=  (P  -h  (1  “  +  Py  -  pIP*  (24) 
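The covariance part of the modified formula, (22) and (23) with β and σ² replaced by their selected-sample estimates, can be sketched as follows. The function and argument names are hypothetical, z is assumed to be ordered as (x, y), and the selection rate p is treated as a single constant, as in Result 6.

```python
import numpy as np

def modified_pl(S_zz_sel, S_xx_pop, p, m, v):
    """Modified Pearson-Lawley covariance of (x, y) with y and variance of y.

    S_zz_sel: covariance matrix of z = (x, y) in the selected sample.
    S_xx_pop: covariance matrix of x in the total sample.
    p: selection rate; m, v: the sensitivity parameters of Section 3.2.
    """
    S_zz_sel = np.asarray(S_zz_sel, dtype=float)
    S_xx_pop = np.asarray(S_xx_pop, dtype=float)
    k = S_xx_pop.shape[0]
    S_xx_sel = S_zz_sel[:k, :k]
    S_xy_sel = S_zz_sel[:k, k]
    S_yy_sel = S_zz_sel[k, k]

    beta = np.linalg.solve(S_xx_sel, S_xy_sel)     # beta* from the selected sample
    sigma2 = S_yy_sel - S_xy_sel @ beta            # residual variance sigma*^2

    f1 = p + (1.0 - p) * m
    f2 = p + (1.0 - p) * v
    f3 = f1 ** 2 + p * (1.0 - p) * (1.0 - m) ** 2

    cov_xy = f1 * (S_xx_pop @ beta)                # equation (22)
    var_y = f2 * sigma2 + f3 * (beta @ S_xx_pop @ beta)   # equation (23)
    return cov_xy, var_y

S_sel = [[1.0, 0.5], [0.5, 2.0]]                   # made-up selected-sample covariance
cov_pl, var_pl = modified_pl(S_sel, [[4.0]], p=0.5, m=1.0, v=1.0)
cov_mod, var_mod = modified_pl(S_sel, [[4.0]], p=0.5, m=0.5, v=1.0)
```

With m = v = 1 the function returns the ordinary Pearson-Lawley values (covariance 2 and variance 2.75 in this example); with m = 0.5 the covariance is scaled by f1 = 0.75.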


4  Sensitivity  analysis 

A sensitivity analysis is of interest to show how much the outcomes are affected by assumptions
about unknown information and parameters. Such an analysis was performed for a nonignorable
nonresponse problem in Allen, Holland and Thayer (1994). Here we carry out a sensitivity analysis
of the Pearson-Lawley adjustment and its modification with respect to the choice of the parameters
m and v.

Suppose that we are interested in a regression analysis of the performance variable y on the
exogenous variables x for the population. The regression coefficients and $R^2$ can be obtained from
the covariance matrix of y and x. According to the previous discussion, there are three
versions of the covariance matrix: one from using the selected sample only, one from using the
Pearson-Lawley adjustment, and one from using the modified PL method.

For example, under the assumption of the selection mechanism (7) and the mixture model (13),
an ordinary least-squares (OLS) estimator from the modified PL method would be

$$ \beta_{mpl} = \Sigma_{xx}^{-1}\,\mathrm{Cov}(x, y) = f_1\beta^* $$

where $\beta^*$ is the estimate from the selected sample only and $f_1$ is the factor defined in (22).

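In the following table, we list the formulas for the variance, covariance, regression coefficients
and $R^2$ when using the selected sample only, the Pearson-Lawley method and the modified PL
method.

| Method | Cov(x, y) | $\beta$ | Cov(y, y) | $R^2$ |
|---|---|---|---|---|
| Selected sample | $\Sigma_{xx}^*\beta^*$ | $\beta^*$ | $\Sigma_{yy}^* = V_s$ | $\beta^{*\prime}\Sigma_{xx}^*\beta^*/V_s = R_s^2$ |
| PL method | $\Sigma_{xx}\beta^*$ | $\beta^*$ | $\sigma^{*2} + \beta^{*\prime}\Sigma_{xx}\beta^* = V_{pl}$ | $\beta^{*\prime}\Sigma_{xx}\beta^*/V_{pl} = R_{pl}^2$ |
| Modified PL method | $f_1\Sigma_{xx}\beta^*$ | $f_1\beta^*$ | $f_2\sigma^{*2} + f_3\beta^{*\prime}\Sigma_{xx}\beta^* = V_{mpl}$ | $f_1^2\beta^{*\prime}\Sigma_{xx}\beta^*/V_{mpl} = R_{mpl}^2$ |

4.1 Sensitivity of $\beta$ and $R^2$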


Now let us look at the sensitivity of the coefficients $\beta$ and $R^2$ to various choices of the
parameters m and v under the model (13).

First, when m = v = 1, that is, $\lambda_1(x) = 0$ and $\lambda_2 = 0$, the selection mechanism (7)
depends on x only. Hence the selection (or missingness) is ignorable, and the distribution of the
unselected sample is the same as that of the selected sample. In this case, $f_1 = f_2 = f_3 = 1$,
and the formulas (22) and (23) become the same as the PL adjustment formulas. The same
statistical inferences are obtained for $\beta$ and $R^2$.

Now suppose m = 1. In this case, the mean of y|x is the same for the selected and the unselected
samples, and adjustments are made only to the variance of y|x for the unselected sample. From
Result 6, $f_1 = f_3 = 1$, so there is no adjustment to the covariances between y and x. Hence the
regression coefficients are the same for both the PL adjustment and the modified PL method. The
difference between $V_{mpl}$ and $V_{pl}$ is $(f_2 - 1)\sigma^{*2}$, which is positive if v > 1 and
negative if v < 1. Hence, compared with $R_{mpl}^2$, $R_{pl}^2$ is an overestimate if v > 1 and an
underestimate if v < 1.

Another interesting case is v = 1, that is, the variance of the selected sample is the same
as the variance of the unselected sample. From Result 6, $f_2 = 1$, and the modified PL covariance
between y and x is the PL covariance multiplied by the scalar factor $f_1$. This $f_1$ is also the
scale factor which affects the slope $\beta$. It can be seen that this factor is a convex combination
of 1 and m with coefficients p and (1 - p), respectively. If m is less than 1, then $f_1 < 1$, and the
regression coefficients from the selected sample or the PL method will be overestimates. On the
other hand, when m is larger than 1, $\beta$ will be underestimated by those two methods. Since
$f_3 = f_1^2 + p(1 - p)(1 - m)^2$ is always at least as large as $f_1^2$, $R_{mpl}^2$ will be smaller than
$R_{pl}^2$ if m < 1.

Generally, when neither m nor v is one, $f_1$, $f_2$ and $f_3$ are not necessarily one. As
above, $f_1$ is the scalar factor for the covariance between y and x and for the regression
coefficient $\beta$. However, the comparison of $R_{pl}^2$ and $R_{mpl}^2$ is not so clear, since
$R_{mpl}^2$ depends on both v and m. In Figure 1, we give some contour plots of the relative bias of
$R_{pl}^2$ compared with $R_{mpl}^2$, i.e. $(R_{pl}^2 - R_{mpl}^2)/R_{mpl}^2$, over the parameters m and v
for given p, $R_{pl}^2$ and $\sigma^2$. From these plots, we can see that

1. In general, $R_{pl}^2$ is sensitive to the variation of m and v. For most of the given values of
p = 0.25 or 0.5 and $R_{pl}^2$ = 0.25 or 0.5, it is quite possible that the relative bias of $R_{pl}^2$
will be larger than 10% or 20%.

2. When v > 1 or m < 1, the relative bias is positive, which means that $R_{pl}^2$ is often an
overestimate of the population $R^2$. However, it can be an underestimate when v < 1 or
m > 1.

3. The relative bias becomes large when the selection rate p is small or the population $R^2$ is
small.

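The relative bias discussed above can be evaluated for any (m, v) pair. The sketch below is an illustration, not the code behind Figure 1; the function name is hypothetical, and it uses the fact that the explained variance β'Σxxβ implied by a given PL value of R² and residual variance σ² is R²σ²/(1 - R²).

```python
def r2_relative_bias(m, v, p, r2_pl, sigma2=1.0):
    """Relative bias (R2_pl - R2_mpl) / R2_mpl for given m, v, p and R2_pl."""
    # Explained variance beta' Sigma_xx beta implied by R2_pl = B / (sigma2 + B).
    explained = r2_pl * sigma2 / (1.0 - r2_pl)
    f1 = p + (1.0 - p) * m
    f2 = p + (1.0 - p) * v
    f3 = f1 ** 2 + p * (1.0 - p) * (1.0 - m) ** 2
    r2_mpl = f1 ** 2 * explained / (f2 * sigma2 + f3 * explained)
    return (r2_pl - r2_mpl) / r2_mpl

# A small grid of hypothetical (m, v) values for p = 0.5 and R2_pl = 0.5.
grid = {(m, v): r2_relative_bias(m, v, p=0.5, r2_pl=0.5)
        for m in (0.5, 1.0, 1.5) for v in (0.5, 1.0, 2.0)}
```

For example, with p = 0.5 and a PL value of 0.5, the pair m = 1, v = 2 gives a modified value of 0.4 and hence a relative bias of 0.25, while m = v = 1 gives no bias. The function is undefined when f1 = 0, which can only happen for a negative m.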

4.2  How  to  get  plausible  values  for  m  and  v 

After seeing the sensitivity of $\beta$ and $R^2$ to the values of m and v in the modified PL method,
we now discuss how to obtain plausible values for the parameters m and v.

First, note that in many practical situations the average score of the unselected sample is
smaller than that of the selected sample. Hence it is often the case that m < 1. On the other hand,
the variance of the performance scores in the unselected sample might be larger than that in the
selected sample, that is, v will usually be larger than 1. The question is how small m or how
large v could be.

It is obvious that we need at least two pieces of information to determine the values of m and
v. For a given study, people may have some prior information about these two parameters. If
reasonable ranges for m and v can be obtained from experts, then a sensitivity analysis can be
given for the regression analysis.

Here is a practical way to obtain the value of v. By definition, v is the ratio of the
variance of the unselected sample to that of the selected sample. In many situations, the
variance ratio of y may be similar to the variance ratio of x. Then the value of v can be
obtained from the ratio of the variances of x between the unselected and selected samples.

For the value of m, it is useful to look at the correlation of y and the selection variable s given
x. This is just the correlation of y and the residual $\delta$ from regressing s on x. Under the model
(13) and the selection mechanism (6), its formula is given in (16). For given x, p and residual
variance $\sigma^2$, $\rho$ is determined by m and v. Figure 2 gives some contour plots of $\rho$ over
various m and v for p = 0.25 or 0.5, $x\beta$ = 1 or 2 and $\sigma^2$ = 1. From this figure, we see that
$\rho$ is not very sensitive to v, especially when v is larger than 1. Hence $\rho$ and m can be roughly
determined from one another. When m < 1, $\rho$ is positive; when m > 1, $\rho$ is negative. In other
words, if the selection residual is positively correlated with the performance score, then the
selected sample will have a better average performance score than the unselected sample for given
values of x. On the other hand, when the unselected sample has a larger average score, the residual
$\delta$ might be negatively related to the performance score y.
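The correlation in (16) can be evaluated as a function of m and v. The following is a minimal sketch: the function name is hypothetical, `KAPPA = pi / sqrt(3)` is an assumption (the scale at which the logistic residual has the unit variance of (9)), and the moments of the normal mixture are computed from raw moments rather than from the closed forms given with Result 4, which is algebraically equivalent.

```python
import math

# Assumption: kappa = pi / sqrt(3), matching the unit residual variance in (9).
KAPPA = math.pi / math.sqrt(3.0)

def rho(m, v, p, xb, sigma2=1.0):
    """corr(y, s | x) from (16), parameterized by m and v (Section 3.2)."""
    lam1 = (v - m) * xb / (v * KAPPA * sigma2)
    lam2 = (1.0 - v) / (2.0 * v * KAPPA * sigma2)
    mu, tau2 = m * xb, v * sigma2      # unselected mean and variance
    q = 1.0 - p

    # Raw moments of the mixture p*N(xb, sigma2) + q*N(mu, tau2).
    ey = p * xb + q * mu
    ey2 = p * (sigma2 + xb ** 2) + q * (tau2 + mu ** 2)
    ey3 = p * (xb ** 3 + 3 * xb * sigma2) + q * (mu ** 3 + 3 * mu * tau2)
    ey4 = (p * (xb ** 4 + 6 * xb ** 2 * sigma2 + 3 * sigma2 ** 2)
           + q * (mu ** 4 + 6 * mu ** 2 * tau2 + 3 * tau2 ** 2))

    var_y = ey2 - ey ** 2              # V(y | x), equation (15)
    cov_y_y2 = ey3 - ey * ey2          # cov(y, y^2 | x)
    var_y2 = ey4 - ey2 ** 2            # V(y^2 | x)

    cov_y_delta = lam1 * var_y + lam2 * cov_y_y2
    var_delta = (1.0 + lam1 ** 2 * var_y + lam2 ** 2 * var_y2
                 + 2.0 * lam1 * lam2 * cov_y_y2)
    return cov_y_delta / math.sqrt(var_y * var_delta)

rho_ignorable = rho(1.0, 1.0, p=0.5, xb=1.0)
rho_low_m = rho(0.5, 1.0, p=0.5, xb=1.0)
rho_high_m = rho(1.5, 1.0, p=0.5, xb=1.0)
```

For a positive value of xβ and v = 1, the sign pattern described above is reproduced: the correlation is zero at m = 1, positive for m < 1 and negative for m > 1.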

For a given data set, p is known, and $x\beta$ and $\sigma^2$ can be estimated from the selected
sample. Then a figure of $\rho$ against m and v can be drawn as in Figure 2.

Note that $\rho$ is the correlation coefficient of y and $\delta$. The residual $\delta$ can be viewed as
the contribution from several other exogenous variables which are not used in the selection. Some
prior information about those extra exogenous variables might be available from the researchers who
design or perform the selection. For example, they may give a rough range for how many other
variables there are that are not used in the selection and how much those variables contribute to
the selection compared with the variables that are used.

Remark 2. Since the formulas for $\rho$ and $V_{mpl}$ are not simple functions of m and v, it is not
possible to solve them explicitly for m and v. It is not necessary, however, to get an exact solution
for a given $\rho$, because both v and m are only estimates. The contour curves should be enough to
give rough values for the parameters m and v. Moreover, the contour plots show some sensitivity
results.


5  Application  to  an  ONR  data  set 

To illustrate the above approach, we apply this method to a data set from Batch A of the Project
A Concurrent Validity Study (see Young, Houston, Harris, Hoffman & Wise, 1990). This data
set includes the hands-on job performance score, 10 subtest scores of the Armed Services Vocational
Aptitude Battery (ASVAB), and some other test scores and background variables for nine different
jobs. There are 4039 observations in total in this data set. The observations are randomly sampled
from enlisted military personnel.

We are interested in a regression analysis, for the whole population, of the Hands-on job
performance score on the ASVAB subtest scores. The 10 ASVAB subtests are Arithmetic Reasoning
(AR), Auto & Shop Information (AS), Coding Speed (CS), Electronics Information (EI), General
Science (GS), Mechanical Comprehension (MC), Mathematics Knowledge (MK), Numerical
Operations (NO), Paragraph Comprehension (PC) and Word Knowledge (WK). For these ASVAB
variables, there is a reference population for the selected sample of 4039, consisting of all 650,278
military applicants in the 1991 fiscal year. The means and covariance matrix of the ASVAB for the
650,278 applicants are given in Table 1 (a) (see Wolfe et al. (1993)). This sample is taken to be
the population from which all military enlistees are selected and from which the 4039 Batch A
persons are sampled. In doing so, we assume that the distribution of the ASVAB subtest scores for
all military applicants is similar in consecutive years.


First, Table 1 (b) gives the mean and covariance matrix for the selected sample of 4039. This
is a consistent estimate of the ASVAB covariance matrix for the total enlisted population.
There are sizable differences between the population covariance matrix and the
covariance matrix of the selected sample. This implies that the selected (that is, enlisted)
persons are not just a random sample from the applicant population. In fact, the selection is
based on the ASVAB subtests and some other variables.

To check linearity and normality, we perform a regression analysis of y (the hands-on job
performance score) on the ASVAB for the selected sample of 4039. The residuals of this regression
are plotted against the predicted y values in Figure 3. The residual plot shows no unusual pattern,
and there is no clear violation of the linearity assumption. Figure 4 gives the histogram of those
residuals and its smoothed density curve. Although the density curve shows a slight skewness to the
left, it is still roughly symmetric, so the normality assumption for the residuals seems reasonable.

In order to carry out a sensitivity analysis, we need to assess the values of m and v. Looking at
the ratios of the variances of the 10 ASVAB variables between the population and the selected
sample, we find that they are between 1.19 and 1.47. Since the ratio of the variances between the
unselected and selected samples may be smaller than that between the population and the selected
sample, we expect the value of v to be between 1 and 1.47. For the following analysis, we take the
range of v to be from 1 to 1.5.
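
The variance ratios quoted above can be recomputed from the diagonals of Table 1 (a) and Table 1 (b). The following is a minimal sketch, not part of the original analysis; the names `POP_VAR`, `SEL_VAR` and `variance_ratios` are illustrative, and the numbers are copied from the tables as printed.

```python
# Diagonal entries (variances) of Table 1 (a) and Table 1 (b), in subtest order.
SUBTESTS = ["AR", "AS", "CS", "EI", "GS", "MC", "MK", "NO", "PC", "WK"]
POP_VAR = [74.743, 84.047, 61.025, 78.428, 76.959, 83.306, 75.500, 64.209, 63.426, 54.083]
SEL_VAR = [51.548, 70.809, 43.887, 54.846, 63.783, 63.253, 53.213, 40.591, 43.046, 43.893]


def variance_ratios(pop_var, sel_var):
    """Return the population-to-selected variance ratio for each subtest."""
    return [p / s for p, s in zip(pop_var, sel_var)]


ratios = dict(zip(SUBTESTS, variance_ratios(POP_VAR, SEL_VAR)))

# Print the ratios from smallest to largest.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}")
```

With the tabulated values, the smallest ratio is that of AS (about 1.19) and the ratio for PC is about 1.47. The ratio for NO comes out slightly above 1.5 with the variances as printed, which may reflect a transcription difference in the tables; the working range of 1 to 1.5 for v is unaffected.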

From the regression analysis of y on the ASVAB variables for the selected sample, we obtain
the following information: the residual variance σ² = 92.05, the variance of y in the selected
sample V_s = 108.24, the adjusted mean of y, μ_y = [...] = 69.77, and the [...] = 18.79.
Finally, according to the military enlistment officers, the rate of enlistment to all military
jobs is about 33%. This rate can be viewed as the selection rate p since the 4039 observations are
randomly sampled from the enlisted population. The contour plot of ρ = corr(y, s|x) can then be
obtained; it is given in Figure 5.
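
As a quick consistency check on the quantities just listed, the unadjusted R² of the selected-sample regression follows from the residual variance and the variance of y. The sketch below is only an arithmetic illustration; the variable and function names are assumptions introduced here.

```python
RESID_VAR = 92.05        # residual variance of the selected-sample regression
VAR_Y_SELECTED = 108.24  # variance of y in the selected sample (V_s)


def r_squared(resid_var, total_var):
    """R^2 as one minus the share of variance left unexplained."""
    return 1.0 - resid_var / total_var


r2_unadjusted = r_squared(RESID_VAR, VAR_Y_SELECTED)
explained_var_selected = VAR_Y_SELECTED - RESID_VAR

print(f"unadjusted R^2: {r2_unadjusted:.3f}")
print(f"explained variance in the selected sample: {explained_var_selected:.2f}")
```

The result agrees, to three decimals, with the R² reported for the unadjusted regression in Table 3.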

Suppose the prior information about ρ is that ρ will not be larger than 20% and that the
correlation is non-negative. Then from Figure 5 we find that the range for m is between 0.95
and 1.0. With these values of m and v, Table 2 gives several adjusted covariances between the
Hands-on score and the ASVAB subtest variables. Since the value of m is close to 1, the covariances
of y and the ASVAB do not differ much. However, the difference in var(y) between the Pearson-Lawley
method and the modified Pearson-Lawley method is apparent.
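
The entries of Table 2 can be cross-checked numerically. The sketch below rests on two assumptions that are inferred from the tabulated numbers rather than quoted from the derivation: that the adjusted var(y) equals f2 times the residual variance (92.05) plus f3 times the quantity reported as 18.79, and that the covariances for m = 0.95 equal f1 times those for m = 1. All names are illustrative.

```python
RESID_VAR = 92.05  # residual variance from the selected-sample regression
VAR_TERM = 18.79   # second variance term quoted in the text (its role is an assumption)

# (v, m) -> (f1, f2, f3, tabulated var(y)) for the four adjusted columns of Table 2.
COLUMNS = {
    (1.0, 1.0): (1.000, 1.000, 1.000, 110.839),   # Pearson-Lawley
    (1.5, 1.0): (1.000, 1.335, 1.000, 141.674),   # modified PL
    (1.5, 0.95): (0.967, 1.335, 0.935, 140.447),  # modified PL
    (1.0, 0.95): (0.967, 1.000, 0.935, 109.611),  # modified PL
}


def rebuilt_variance(f2, f3):
    """Assumed decomposition: f2 scales the residual variance, f3 the other term."""
    return f2 * RESID_VAR + f3 * VAR_TERM


for (v, m), (f1, f2, f3, tabulated) in COLUMNS.items():
    print(f"v={v}, m={m}: rebuilt {rebuilt_variance(f2, f3):.3f}, tabulated {tabulated}")

# Covariances of the Hands-on score with the ten subtests (AR ... WK) from Table 2.
COV_M1 = [17.636, 35.740, 2.567, 27.316, 22.023, 30.906, 15.773, 1.925, 8.528, 10.700]
COV_M095 = [17.046, 34.543, 2.481, 26.401, 21.285, 29.871, 15.244, 1.860, 8.242, 10.341]

# Ratio of each m = 0.95 covariance to the matching m = 1 covariance.
cov_ratios = [low / high for high, low in zip(COV_M1, COV_M095)]
print([round(r, 4) for r in cov_ratios])
```

Under these assumptions the rebuilt variances stay within about 0.01 of the tabulated ones, and every covariance ratio lies close to the tabulated f1 = 0.967, which is consistent with m rescaling the covariances only slightly while v drives the change in var(y).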

In Table 3, the standardized regression coefficients and their t-values are listed. In the last
row, the R² values of the regressions are given. We can see that the modified Pearson-Lawley method
gives slightly smaller values for the standardized coefficients β. However, the difference in
R² is clearer. When v = 1.5, the relative difference between the PL method and the modified PL
method is about 36%. This is because the modification assumes that the mean of the unselected
sample is smaller than that of the selected sample, while the variance of the unselected sample is
larger than that of the selected sample. Both of these assumptions are quite reasonable for a
selection problem like this.
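
The size of the R² differences can be recomputed from the last row of Table 3. This is a small sketch under one stated assumption: the relative difference is taken with the modified-PL value as the base, a convention that is not spelled out in the text. The names are illustrative.

```python
R2_PL = 0.146  # R^2 under the Pearson-Lawley correction (Table 3)

# (v, m) -> R^2 under the modified Pearson-Lawley correction (Table 3)
R2_MODIFIED = {
    (1.5, 1.0): 0.114,
    (1.5, 0.95): 0.108,
    (1.0, 0.95): 0.138,
}


def relative_difference(r2_pl, r2_mod):
    """Relative difference of the PL R^2, using the modified-PL R^2 as the base."""
    return (r2_pl - r2_mod) / r2_mod


for (v, m), r2_mod in R2_MODIFIED.items():
    print(f"v={v}, m={m}: {relative_difference(R2_PL, r2_mod):.1%}")
```

With the rounded table entries, the column for v = 1.5 and m = 0.95 gives a relative difference of roughly 35%, close to the quoted figure; the small gap may come from rounding in the table. The column for v = 1 and m = 0.95 changes R² far less, so the variance parameter v accounts for most of the difference.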


6  Conclusion 

In the above study, we considered the effects of nonignorable selection on the Pearson-Lawley
adjustment formula for a covariance matrix. The PL formula gives a good correction for the
selection bias when the selection is at random, that is, when the missingness is ignorable. However,
when this condition is not satisfied, the PL formula may be biased. The bias will depend on the
model specification and the selection mechanism.
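
For reference, the unmodified correction can be sketched in a few lines. The code below uses the standard textbook form of the Pearson-Lawley correction for explicit selection on x, which assumes that the regression slopes and the residual variance are unchanged by selection; it is not the modified formula of this paper. The function names and the one-predictor numbers are made up for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def pearson_lawley(s_xx, s_xy, s_yy, sigma_xx):
    """Return (population cov(x, y), population var(y)).

    s_xx, s_xy, s_yy: selected-sample cov(x), cov(x, y) and var(y).
    sigma_xx: population covariance matrix of x.
    """
    slopes = solve(s_xx, s_xy)             # slopes, assumed invariant under selection
    cov_xy = [dot(row, slopes) for row in sigma_xx]
    resid_var = s_yy - dot(s_xy, slopes)   # residual variance, assumed invariant
    var_y = resid_var + dot(slopes, cov_xy)
    return cov_xy, var_y


# One-predictor illustration with made-up numbers (not taken from the paper).
cov_xy, var_y = pearson_lawley([[4.0]], [2.0], 10.0, [[8.0]])
print(cov_xy, var_y)
```

When the population covariance of x equals the selected-sample covariance, the function returns its inputs unchanged, which is the sense in which the correction matters only when selection alters the distribution of x.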

A mixture model with a logistic selection probability is proposed in this paper. Based on this,
a modified Pearson-Lawley formula is derived. This gives a further correction to the PL method
when the missingness is not at random. From the sensitivity analysis, we can see to what degree
the PL formula will be biased when the selected sample has a different mean or variance from that
of the unselected sample. In some cases, this bias might be serious. Typically, the relative bias
of R² for a regression can easily be 20 or 30 percent.

To obtain information about the modification parameters, one may assume that the variance
ratio of the dependent variable between the population and the selected sample is similar to
that of the independent variables. This will provide a reasonable range for one of the modification
parameters. The other parameter may be assessed from prior information about the correlation
of the performance variable y with the residual of the selection variable s.

Finally, it is very common in practice that the selected and unselected samples have different
means and variances. In fact, it is the goal of a selection procedure to choose from a population
candidates who have better performance ability. Hence a selection will quite often
not be at random, that is, the missingness will not be ignorable. It would be valuable to
investigate the selection mechanism further and to use an appropriate modification in the
analysis of a data set that involves selection.

Figure 1: Some contour plots of the relative bias of R²_pl vs. R²_mpl on the parameters m and v for
given p and [...].

Figure 2: Some contour plots of the correlation ρ = corr(y, s|x) on the parameters m and v for
given p, xβ and σ².

Table 1 a). Mean and covariance matrix of the ASVAB
from the population sample (n=650,278)*

          AR      AS      CS      EI      GS      MC      MK      NO      PC      WK
AR    74.743  31.736  26.701  37.271  46.346  48.406  53.106  32.583  39.540  37.910
AS    31.736  84.047   4.174  54.333  41.836  51.718  15.659   3.449  24.755  29.437
CS    26.701   4.174  61.025  10.175  17.046  15.769  27.678  40.069  24.006  18.832
EI    37.271  54.333  10.175  78.428  48.520  50.956  28.442  10.307  31.348  34.803
GS    46.346  41.836  17.046  48.520  76.959  51.060  42.246  19.338  42.469  46.458
MC    48.406  51.718  15.769  50.956  51.060  83.306  39.167  16.667  35.271  36.735
MK    53.106  15.659  27.678  28.442  42.246  39.167  75.500  34.543  34.581  31.744
NO    32.583   3.449  40.069  10.307  19.338  16.667  34.543  64.209  25.266  19.116
PC    39.540  24.755  24.006  31.348  42.469  35.271  34.581  25.266  63.426  42.849
WK    37.910  29.437  18.832  34.803  46.458  36.735  31.744  19.116  42.849  54.083
Mean  50.664  51.409  52.266  50.333  50.615  51.941  51.210  52.512  51.156  51.310

* Source: Table A-1 of SCAT Draft report by Wolfe et al. (1993).


Table 1 b). Mean and covariance matrix of the ASVAB
from the selected sample (n=4039)

          AR      AS      CS      EI      GS      MC      MK      NO      PC      WK
AR    51.548  19.597   8.621  18.701  28.307  27.887  37.579  10.216  20.553  20.925
AS    19.597  70.809  -5.110  36.498  31.474  38.293  11.789  -7.986  15.709  18.073
CS     8.621  -5.110  43.887  -1.333   2.060  -0.282  10.810  22.190   6.799   3.131
EI    18.701  36.498  -1.333  54.846  31.354  34.859  17.237  -4.385  16.573  21.284
GS    28.307  31.474   2.060  31.354  63.783  34.049  28.528  -0.058  29.167  37.200
MC    27.887  38.293  -0.282  34.859  34.049  63.253  26.133  -3.704  18.858  22.077
MK    37.579  11.789  10.810  17.237  28.528  26.133  53.213  13.307  19.818  21.583
NO    10.216  -7.986  22.190  -4.385  -0.058  -3.704  13.307  40.591   2.553   0.043
PC    20.553  15.709   6.799  16.573  29.167  18.858  19.818   2.553  43.046  27.389
WK    20.925  18.073   3.131  21.284  37.200  22.077  21.583   0.043  27.389  43.893
Mean  53.161  54.484  51.661  52.158  51.786  53.467  51.221  52.762  51.787  50.920

Table 2: Covariances and their adjustments
between Hands-on performance scores and the ASVAB
variables for different v and m values

             No_adj        PL   ------- Modified PL -------
Para v                      1       1.5       1.5         1
Para m                      1         1      0.95      0.95
factor f1               1.000     1.000     0.967     0.967
factor f2               1.000     1.335     1.335     1.000
factor f3               1.000     1.000     0.935     0.935
HDON HDON   108.237   110.839   141.674   140.447   109.611
HDON AR      11.956    17.636    17.636    17.046    17.046
HDON AS      30.546    35.740    35.740    34.543    34.543
HDON CS      -1.887     2.567     2.567     2.481     2.481
HDON EI      19.576    27.316    27.316    26.401    26.401
HDON GS      16.127    22.023    22.023    21.285    21.285
HDON MC      24.673    30.906    30.906    29.871    29.871
HDON MK      11.841    15.773    15.773    15.244    15.244
HDON NO      -3.592     1.925     1.925     1.860     1.860
HDON PC       4.868     8.528     8.528     8.242     8.242
HDON WK       6.073    10.700    10.700    10.341    10.341

Table 3: Standardized regression coefficients, t-values and R²
for different choices of v and m values

      No_adj               PL               Modified PL
v                           1              1.5              1.5                1
m                           1                1             0.95             0.95
f1                          1            1.000            0.967            0.967
f2                          1            1.335            1.335            1.000
f3                          1            1.000            0.935            0.935

        Beta        t    Beta        t    Beta        t    Beta        t    Beta        t
AR    -0.027   -1.211  -0.032   -1.312  -0.029   -1.140  -0.028   -1.102  -0.031   -1.269
AS     0.270   13.604   0.291   13.388   0.257   11.627   0.250   11.245   0.283   12.950
CS     0.008    0.478   0.010    0.497   0.009    0.432   0.008    0.418   0.009    0.481
EI     0.034    1.665   0.040    1.773   0.035    1.540   0.034    1.489   0.039    1.715
GS     0.043    1.848   0.047    1.890   0.042    1.641   0.041    1.587   0.046    1.828
MC     0.124    5.872   0.141    6.117   0.125    5.313   0.121    5.138   0.137    5.917
MK     0.109    4.811   0.129    5.669   0.114    4.923   0.110    4.762   0.125    5.484
NO    -0.028   -1.531  -0.034   -1.672  -0.030   -1.452  -0.029   -1.404  -0.033   -1.617
PC    -0.057   -2.929  -0.069   -3.038  -0.061   -2.638  -0.059   -2.552  -0.067   -2.939
WK    -0.098   -4.333  -0.107   -4.252  -0.095   -3.693  -0.092   -3.572  -0.104   -4.113
R²     0.150            0.146            0.114            0.108            0.138

